Search | VHL Regional Portal

1.

Assessing variability in non-contrast CT for the evaluation of stroke: The effect of CT image reconstruction conditions on AI-based CAD measurements of ASPECTS value and hypodense volume.

Welland, Spencer H; Kim, Grace Hyun J; Yadav, Anil; Hoffman, John M; Hsu, William; Brown, Matthew S; Tavakkol, Elham; Nael, Kambiz; McNitt-Gray, Michael F.

Proc SPIE Int Soc Opt Eng ; 129272024 Feb.

Article in English | MEDLINE | ID: mdl-38645463

ABSTRACT

Purpose: To rule out hemorrhage, non-contrast CT (NCCT) scans are used for early evaluation of patients with suspected stroke. Recently, artificial intelligence tools have been developed to assist with determining eligibility for reperfusion therapies by automating measurement of the Alberta Stroke Program Early CT Score (ASPECTS), a 10-point scale with > 7 or ≤ 7 being a threshold for change in functional outcome prediction and higher chance of symptomatic hemorrhage, and hypodense volume. The purpose of this work was to investigate the effects of CT reconstruction kernel and slice thickness on ASPECTS and hypodense volume. Methods: The NCCT series image data of 87 patients imaged with a CT stroke protocol at our institution were reconstructed with 3 kernels (H10s-smooth, H40s-medium, H70h-sharp) and 2 slice thicknesses (1.5mm and 5mm) to create a reference condition (H40s/5mm) and 5 non-reference conditions. Each reconstruction for each patient was analyzed with the Brainomix e-Stroke software (Brainomix, Oxford, England) which yields an ASPECTS value and measure of total hypodense volume (mL). Results: An ASPECTS value was returned for 74 of 87 cases in the reference condition (13 failures). ASPECTS in non-reference conditions changed from that measured in the reference condition for 59 cases, 7 of which changed above or below the clinical threshold of 7 for 3 non-reference conditions. ANOVA tests were performed to compare the differences in protocols, Dunnett's post-hoc tests were performed after ANOVA, and a significance level of p < 0.05 was defined. There was no significant effect of kernel (p = 0.91), a significant effect of slice thickness (p < 0.01) and no significant interaction between these factors (p = 0.91). Post-hoc tests indicated no significant difference between ASPECTS estimated in the reference and any non-reference conditions. There was a significant effect of kernel (p < 0.01) and slice thickness (p < 0.01) on hypodense volume, however there was no significant interaction between these factors (p = 0.79). Post-hoc tests indicated significantly different hypodense volume measurements for H10s/1.5mm (p = 0.03), H40s/1.5mm (p < 0.01), H70h/5mm (p < 0.01). No significant difference was found in hypodense volume measured in the H10s/5mm condition (p = 0.96). Conclusion: Automated ASPECTS and hypodense volume measurements can be significantly impacted by reconstruction kernel and slice thickness.

2.

Transportability Analysis-A Tool for Extending Trial Results to a Representative Target Population.

Inoue, Kosuke; Hsu, William.

JAMA Netw Open ; 7(1): e2346302, 2024 Jan 02.

Article in English | MEDLINE | ID: mdl-38289608

3.

Towards a framework for interoperability and reproducibility of predictive models.

Rahrooh, Al; Garlid, Anders O; Bartlett, Kelly; Coons, Warren; Petousis, Panayiotis; Hsu, William; Bui, Alex A T.

J Biomed Inform ; 149: 104551, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38000765

ABSTRACT

The development and deployment of machine learning (ML) models for biomedical research and healthcare currently lacks standard methodologies. Although tools for model replication are numerous, without a unifying blueprint it remains difficult to scientifically reproduce predictive ML models for any number of reasons (e.g., assumptions regarding data distributions and preprocessing, unclear test metrics, etc.) and ultimately, questions around generalizability and transportability are not readily answered. To facilitate scientific reproducibility, we built upon the Predictive Model Markup Language (PMML) to capture essential information. As a key component of the PREdictive Model Index and Exchange REpository (PREMIERE) platform, we present the Automated Metadata Pipeline (AMP) for conversion of a given predictive ML model into an extended PMML file that autocompletes an ML-based checklist, assessing model elements for interoperability and reproducibility. We demonstrate this pipeline on multiple test cases with three different ML algorithms and health-related datasets, providing a foundation for future predictive model reproducibility, sharing, and comparison.

Subject(s)

Biomedical Research , Reproducibility of Results , Algorithms , Records , Metadata

4.

Expanding Role of Advanced Image Analysis in CT-detected Indeterminate Pulmonary Nodules and Early Lung Cancer Characterization.

Prosper, Ashley Elizabeth; Kammer, Michael N; Maldonado, Fabien; Aberle, Denise R; Hsu, William.

Radiology ; 309(1): e222904, 2023 10.

Article in English | MEDLINE | ID: mdl-37815447

ABSTRACT

The implementation of low-dose chest CT for lung screening presents a crucial opportunity to advance lung cancer care through early detection and interception. In addition, millions of pulmonary nodules are incidentally detected annually in the United States, increasing the opportunity for early lung cancer diagnosis. Yet, realization of the full potential of these opportunities is dependent on the ability to accurately analyze image data for purposes of nodule classification and early lung cancer characterization. This review presents an overview of traditional image analysis approaches in chest CT using semantic characterization as well as more recent advances in the technology and application of machine learning models using CT-derived radiomic features and deep learning architectures to characterize lung nodules and early cancers. Methodological challenges currently faced in translating these decision aids to clinical practice, as well as the technical obstacles of heterogeneous imaging parameters, optimal feature selection, choice of model, and the need for well-annotated image data sets for the purposes of training and validation, will be reviewed, with a view toward the ultimate incorporation of these potentially powerful decision aids into routine clinical practice.

Subject(s)

Lung Neoplasms , Multiple Pulmonary Nodules , Humans , Lung Neoplasms/diagnostic imaging , Multiple Pulmonary Nodules/diagnostic imaging , Image Processing, Computer-Assisted , Tomography, X-Ray Computed

5.

Tailoring pretext tasks to improve self-supervised learning in histopathologic subtype classification of lung adenocarcinomas.

Ding, Ruiwen; Yadav, Anil; Rodriguez, Erika; Araujo Lemos da Silva, Ana Cristina; Hsu, William.

Comput Biol Med ; 166: 107484, 2023 Sep 16.

Article in English | MEDLINE | ID: mdl-37741228

ABSTRACT

Lung adenocarcinoma (LUAD) is a morphologically heterogeneous disease with five predominant histologic subtypes. Fully supervised convolutional neural networks can improve the accuracy and reduce the subjectivity of LUAD histologic subtyping using hematoxylin and eosin (H&E)-stained whole slide images (WSIs). However, developing supervised models with good prediction accuracy usually requires extensive manual data annotation, which is time-consuming and labor-intensive. This work proposes three self-supervised learning (SSL) pretext tasks to reduce labeling effort. These tasks not only leverage the multi-resolution nature of the H&E WSIs but also explicitly consider the relevance to the downstream task of classifying the LUAD histologic subtypes. Two tasks involve predicting the spatial relationship between tiles cropped from lower and higher magnification WSIs. We hypothesize that these tasks induce the model to learn to distinguish different tissue structures presented in the images, thus benefiting the downstream classification. The third task involves predicting the eosin stain from the hematoxylin stain, inducing the model to learn cytoplasmic features relevant to LUAD subtypes. The effectiveness of the three proposed SSL tasks and their ensemble was demonstrated by comparison with other state-of-the-art pretraining and SSL methods using three publicly available datasets. Our work can be extended to any other cancer type where tissue architectural information is important. The model could be used to expedite and complement the process of routine pathology diagnosis tasks. The code is available at https://github.com/rina-ding/ssl_luad_classification.

6.

Comprehensive tissue deconvolution of cell-free DNA by deep learning for disease diagnosis and monitoring.

Li, Shuo; Zeng, Weihua; Ni, Xiaohui; Liu, Qiao; Li, Wenyuan; Stackpole, Mary L; Zhou, Yonggang; Gower, Arjan; Krysan, Kostyantyn; Ahuja, Preeti; Lu, David S; Raman, Steven S; Hsu, William; Aberle, Denise R; Magyar, Clara E; French, Samuel W; Han, Steven-Huy B; Garon, Edward B; Agopian, Vatche G; Wong, Wing Hung; Dubinett, Steven M; Zhou, Xianghong Jasmine.

Proc Natl Acad Sci U S A ; 120(28): e2305236120, 2023 07 11.

Article in English | MEDLINE | ID: mdl-37399400

ABSTRACT

Plasma cell-free DNA (cfDNA) is a noninvasive biomarker for cell death of all organs. Deciphering the tissue origin of cfDNA can reveal abnormal cell death because of diseases, which has great clinical potential in disease detection and monitoring. Despite the great promise, the sensitive and accurate quantification of tissue-derived cfDNA remains challenging to existing methods due to the limited characterization of tissue methylation and the reliance on unsupervised methods. To fully exploit the clinical potential of tissue-derived cfDNA, here we present one of the largest comprehensive and high-resolution methylation atlas based on 521 noncancer tissue samples spanning 29 major types of human tissues. We systematically identified fragment-level tissue-specific methylation patterns and extensively validated them in orthogonal datasets. Based on the rich tissue methylation atlas, we develop the first supervised tissue deconvolution approach, a deep-learning-powered model, cfSort, for sensitive and accurate tissue deconvolution in cfDNA. On the benchmarking data, cfSort showed superior sensitivity and accuracy compared to the existing methods. We further demonstrated the clinical utilities of cfSort with two potential applications: aiding disease diagnosis and monitoring treatment side effects. The tissue-derived cfDNA fraction estimated from cfSort reflected the clinical outcomes of the patients. In summary, the tissue methylation atlas and cfSort enhanced the performance of tissue deconvolution in cfDNA, thus facilitating cfDNA-based disease detection and longitudinal treatment monitoring.

Subject(s)

Cell-Free Nucleic Acids , Deep Learning , Humans , Cell-Free Nucleic Acids/genetics , DNA Methylation , Biomarkers , Promoter Regions, Genetic , Biomarkers, Tumor/genetics

7.

Factors Associated With Nonadherence to Lung Cancer Screening Across Multiple Screening Time Points.

Lin, Yannan; Liang, Li-Jung; Ding, Ruiwen; Prosper, Ashley Elizabeth; Aberle, Denise R; Hsu, William.

JAMA Netw Open ; 6(5): e2315250, 2023 05 01.

Article in English | MEDLINE | ID: mdl-37227725

ABSTRACT

Importance: Screening with low-dose computed tomography (CT) has been shown to reduce mortality from lung cancer in randomized clinical trials in which the rate of adherence to follow-up recommendations was over 90%; however, adherence to Lung Computed Tomography Screening Reporting & Data System (Lung-RADS) recommendations has been low in practice. Identifying patients who are at risk of being nonadherent to screening recommendations may enable personalized outreach to improve overall screening adherence. Objective: To identify factors associated with patient nonadherence to Lung-RADS recommendations across multiple screening time points. Design, Setting, and Participants: This cohort study was conducted at a single US academic medical center across 10 geographically distributed sites where lung cancer screening is offered. The study enrolled individuals who underwent low-dose CT screening for lung cancer between July 31, 2013, and November 30, 2021. Exposures: Low-dose CT screening for lung cancer. Main Outcomes and Measures: The main outcome was nonadherence to follow-up recommendations for lung cancer screening, defined as failing to complete a recommended or more invasive follow-up examination (ie, diagnostic dose CT, positron emission tomography-CT, or tissue sampling vs low-dose CT) within 15 months (Lung-RADS score, 1 or 2), 9 months (Lung-RADS score, 3), 5 months (Lung-RADS score, 4A), or 3 months (Lung-RADS score, 4B/X). Multivariable logistic regression was used to identify factors associated with patient nonadherence to baseline Lung-RADS recommendations. A generalized estimating equations model was used to assess whether the pattern of longitudinal Lung-RADS scores was associated with patient nonadherence over time. Results: Among 1979 included patients, 1111 (56.1%) were aged 65 years or older at baseline screening (mean [SD] age, 65.3 [6.6] years), and 1176 (59.4%) were male. The odds of being nonadherent were lower among patients with a baseline Lung-RADS score of 1 or 2 vs 3 (adjusted odds ratio [AOR], 0.35; 95% CI, 0.25-0.50), 4A (AOR, 0.21; 95% CI, 0.13-0.33), or 4B/X, (AOR, 0.10; 95% CI, 0.05-0.19); with a postgraduate vs college degree (AOR, 0.70; 95% CI, 0.53-0.92); with a family history of lung cancer vs no family history (AOR, 0.74; 95% CI, 0.59-0.93); with a high age-adjusted Charlson Comorbidity Index score (≥4) vs a low score (0 or 1) (AOR, 0.67; 95% CI, 0.46-0.98); in the high vs low income category (AOR, 0.79; 95% CI, 0.65-0.98); and referred by physicians from pulmonary or thoracic-related departments vs another department (AOR, 0.56; 95% CI, 0.44-0.73). Among 830 eligible patients who had completed at least 2 screening examinations, the adjusted odds of being nonadherent to Lung-RADS recommendations at the following screening were increased in patients with consecutive Lung-RADS scores of 1 to 2 (AOR, 1.38; 95% CI, 1.12-1.69). Conclusions and Relevance: In this retrospective cohort study, patients with consecutive negative lung cancer screening results were more likely to be nonadherent with follow-up recommendations. These individuals are potential candidates for tailored outreach to improve adherence to recommended annual lung cancer screening.

Subject(s)

Lung Neoplasms , Humans , Male , Aged , Female , Lung Neoplasms/diagnostic imaging , Cohort Studies , Early Detection of Cancer/methods , Retrospective Studies , Tomography, X-Ray Computed/methods

8.

The Effects of a Fasting Mimicking Diet on Skin Hydration, Skin Texture, and Skin Assessment: A Randomized Controlled Trial.

Maloh, Jessica; Wei, Min; Hsu, William C; Caputo, Sara; Afzal, Najiba; Sivamani, Raja K.

J Clin Med ; 12(5)2023 Feb 21.

Article in English | MEDLINE | ID: mdl-36902498

ABSTRACT

Diet and nutrition have been shown to impact dermatological conditions. This has increased attention toward integrative and lifestyle medicine in the management of skin health. Emerging research around fasting diets, specifically the fasting-mimicking diet (FMD), has provided clinical evidence for chronic inflammatory, cardiometabolic, and autoimmune diseases. In this randomized controlled trial, we evaluated the effects of a five-day FMD protocol, administrated once a month for three months, on facial skin parameters, including skin hydration and skin roughness, in a group of 45 healthy women between the ages of 35 to 60 years old over the course of 71 days. The results of the study revealed that the three consecutive monthly cycles of FMD resulted in a significant percentage increase in skin hydration at day 11 (p = 0.00013) and at day 71 (p = 0.02) relative to baseline. The results also demonstrated maintenance of skin texture in the FMD group compared to an increase in skin roughness in the control group (p = 0.032). In addition to skin biophysical properties, self-reported data also demonstrated significant improvement in components of mental states such as happiness (p = 0.003) and confidence (0.039). Overall, these findings provide evidence for the potential use of FMD in improving skin health and related components of psychological well-being.

9.

Improving the Quantitative Analysis of Breast Microcalcifications: A Multiscale Approach.

Marasinou, Chrysostomos; Li, Bo; Paige, Jeremy; Omigbodun, Akinyinka; Nakhaei, Noor; Hoyt, Anne; Hsu, William.

J Digit Imaging ; 36(3): 1016-1028, 2023 06.

Article in English | MEDLINE | ID: mdl-36820930

ABSTRACT

Accurate characterization of microcalcifications (MCs) in 2D digital mammography is a necessary step toward reducing the diagnostic uncertainty associated with the callback of indeterminate MCs. Quantitative analysis of MCs can better identify MCs with a higher likelihood of ductal carcinoma in situ or invasive cancer. However, automated identification and segmentation of MCs remain challenging with high false positive rates. We present a two-stage multiscale approach to MC segmentation in 2D full-field digital mammograms (FFDMs) and diagnostic magnification views. Candidate objects are first delineated using blob detection and Hessian analysis. A regression convolutional network, trained to output a function with a higher response near MCs, chooses the objects which constitute actual MCs. The method was trained and validated on 435 screening and diagnostic FFDMs from two separate datasets. We then used our approach to segment MCs on magnification views of 248 cases with amorphous MCs. We modeled the extracted features using gradient tree boosting to classify each case as benign or malignant. Compared to state-of-the-art comparison methods, our approach achieved superior mean intersection over the union (0.670 ± 0.121 per image versus 0.524 ± 0.034 per image), intersection over the union per MC object (0.607 ± 0.250 versus 0.363 ± 0.278) and true positive rate of 0.744 versus 0.581 at 0.4 false positive detections per square centimeter. Features generated using our approach outperformed the comparison method (0.763 versus 0.710 AUC) in distinguishing amorphous calcifications as benign or malignant.

Subject(s)

Breast Diseases , Breast Neoplasms , Calcinosis , Humans , Female , Radiographic Image Enhancement/methods , Breast Diseases/diagnostic imaging , Mammography/methods , Calcinosis/diagnostic imaging , Probability , Breast Neoplasms/diagnostic imaging

10.

A Competition, Benchmark, Code, and Data for Using Artificial Intelligence to Detect Lesions in Digital Breast Tomosynthesis.

Konz, Nicholas; Buda, Mateusz; Gu, Hanxue; Saha, Ashirbani; Yang, Jichen; Chledowski, Jakub; Park, Jungkyu; Witowski, Jan; Geras, Krzysztof J; Shoshan, Yoel; Gilboa-Solomon, Flora; Khapun, Daniel; Ratner, Vadim; Barkan, Ella; Ozery-Flato, Michal; Martí, Robert; Omigbodun, Akinyinka; Marasinou, Chrysostomos; Nakhaei, Noor; Hsu, William; Sahu, Pranjal; Hossain, Md Belayat; Lee, Juhun; Santos, Carlos; Przelaskowski, Artur; Kalpathy-Cramer, Jayashree; Bearce, Benjamin; Cha, Kenny; Farahani, Keyvan; Petrick, Nicholas; Hadjiiski, Lubomir; Drukker, Karen; Armato, Samuel G; Mazurowski, Maciej A.

JAMA Netw Open ; 6(2): e230524, 2023 02 01.

Article in English | MEDLINE | ID: mdl-36821110

ABSTRACT

Importance: An accurate and robust artificial intelligence (AI) algorithm for detecting cancer in digital breast tomosynthesis (DBT) could significantly improve detection accuracy and reduce health care costs worldwide. Objectives: To make training and evaluation data for the development of AI algorithms for DBT analysis available, to develop well-defined benchmarks, and to create publicly available code for existing methods. Design, Setting, and Participants: This diagnostic study is based on a multi-institutional international grand challenge in which research teams developed algorithms to detect lesions in DBT. A data set of 22â¯032 reconstructed DBT volumes was made available to research teams. Phase 1, in which teams were provided 700 scans from the training set, 120 from the validation set, and 180 from the test set, took place from December 2020 to January 2021, and phase 2, in which teams were given the full data set, took place from May to July 2021. Main Outcomes and Measures: The overall performance was evaluated by mean sensitivity for biopsied lesions using only DBT volumes with biopsied lesions; ties were broken by including all DBT volumes. Results: A total of 8 teams participated in the challenge. The team with the highest mean sensitivity for biopsied lesions was the NYU B-Team, with 0.957 (95% CI, 0.924-0.984), and the second-place team, ZeDuS, had a mean sensitivity of 0.926 (95% CI, 0.881-0.964). When the results were aggregated, the mean sensitivity for all submitted algorithms was 0.879; for only those who participated in phase 2, it was 0.926. Conclusions and Relevance: In this diagnostic study, an international competition produced algorithms with high sensitivity for using AI to detect lesions on DBT images. A standardized performance benchmark for the detection task using publicly available clinical imaging data was released, with detailed descriptions and analyses of submitted algorithms accompanied by a public release of their predictions and code for selected methods. These resources will serve as a foundation for future research on computer-assisted diagnosis methods for DBT, significantly lowering the barrier of entry for new researchers.

Subject(s)

Artificial Intelligence , Breast Neoplasms , Humans , Female , Benchmarking , Mammography/methods , Algorithms , Radiographic Image Interpretation, Computer-Assisted/methods , Breast Neoplasms/diagnostic imaging

11.

Racial Disparities in Quantitative MRI for African American and White Men with Prostate Cancer.

Zabihollahy, Fatemeh; Miao, Qi; Sonni, Ida; Vangala, Sitaram; Kim, Harrison; Hsu, William; Sisk, Anthony; Reiter, Robert; Raman, Steven; Sung, Kyunghyun.

Res Sq ; 2023 Feb 15.

Article in English | MEDLINE | ID: mdl-36824946

ABSTRACT

The risk of prostate cancer (PCa) is strongly influenced by race and ethnicity. The purpose of this study is to investigate differences in the diagnostic performance of multiparametric MRI (mpMRI) in African American (AA) and white (W) men. 111 patients (37 AA and 74 W men) were selected from the study's initial cohort of 885 patients after matching age, prostate-specific antigen, and prostate volume. The diagnostic performance of mpMRI was assessed using detection rates (DRs) and positive predictive values (PPVs) with/without combining Ktrans (volume transfer constant) stratified by prostate zones for AA and W sub-cohorts. The DRs of mpMRI for clinically significant PCa (csPCa) lesions in AA and W sub-cohort with PI-RADS scores ≥ 3 were 67.3% vs. 80.3% in the transition zone (TZ; p=0.026) and 81.2% vs. 76.1% in the peripheral zone (PZ; p>0.9). The Ktrans of csPCa in AA men was significantly higher than in W men (0.23±0.08 min-1 vs. 0.19±0.07 min-1; p=0.022). This emphasizes that there are race-related differences in the performance of mpMRI and quantitative MRI measures that are not reflected in age, PSA, and prostate volume.

12.

Variability Among Breast Cancer Risk Classification Models When Applied at the Level of the Individual Woman.

Paige, Jeremy S; Lee, Christoph I; Wang, Pin-Chieh; Hsu, William; Brentnall, Adam R; Hoyt, Anne C; Naeim, Arash; Elmore, Joann G.

J Gen Intern Med ; 38(11): 2584-2592, 2023 08.

Article in English | MEDLINE | ID: mdl-36749434

ABSTRACT

BACKGROUND: Breast cancer risk models guide screening and chemoprevention decisions, but the extent and effect of variability among models, particularly at the individual level, is uncertain. OBJECTIVE: To quantify the accuracy and disagreement between commonly used risk models in categorizing individual women as average vs. high risk for developing invasive breast cancer. DESIGN: Comparison of three risk prediction models: Breast Cancer Risk Assessment Tool (BCRAT), Breast Cancer Surveillance Consortium (BCSC) model, and International Breast Intervention Study (IBIS) model. SUBJECTS: Women 40 to 74 years of age presenting for screening mammography at a multisite health system between 2011 and 2015, with 5-year follow-up for cancer outcome. MAIN MEASURES: Comparison of model discrimination and calibration at the population level and inter-model agreement for 5-year breast cancer risk at the individual level using two cutoffs (≥ 1.67% and ≥ 3.0%). KEY RESULTS: A total of 31,115 women were included. When using the ≥ 1.67% threshold, more than 21% of women were classified as high risk for developing breast cancer in the next 5 years by one model, but average risk by another model. When using the ≥ 3.0% threshold, more than 5% of women had disagreements in risk severity between models. Almost half of the women (46.6%) were classified as high risk by at least one of the three models (e.g., if all three models were applied) for the threshold of ≥ 1.67%, and 11.1% were classified as high risk for ≥ 3.0%. All three models had similar accuracy at the population level. CONCLUSIONS: Breast cancer risk estimates for individual women vary substantially, depending on which risk assessment model is used. The choice of cutoff used to define high risk can lead to adverse effects for screening, preventive care, and quality of life for misidentified individuals. Clinicians need to be aware of the high false-positive and false-negative rates and variation between models when talking with patients.

Subject(s)

Breast Neoplasms , Humans , Female , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/epidemiology , Mammography/adverse effects , Risk Factors , Quality of Life , Early Detection of Cancer , Risk Assessment

13.

CTFlow: Mitigating Effects of Computed Tomography Acquisition and Reconstruction with Normalizing Flows.

Wei, Leihao; Yadav, Anil; Hsu, William.

Med Image Comput Comput Assist Interv ; 14226: 413-422, 2023 Oct.

Article in English | MEDLINE | ID: mdl-38737498

ABSTRACT

Mitigating the effects of image appearance due to variations in computed tomography (CT) acquisition and reconstruction parameters is a challenging inverse problem. We present CTFlow, a normalizing flows-based method for harmonizing CT scans acquired and reconstructed using different doses and kernels to a target scan. Unlike existing state-of-the-art image harmonization approaches that only generate a single output, flow-based methods learn the explicit conditional density and output the entire spectrum of plausible reconstruction, reflecting the underlying uncertainty of the problem. We demonstrate how normalizing flows reduces variability in image quality and the performance of a machine learning algorithm for lung nodule detection. We evaluate the performance of CTFlow by 1) comparing it with other techniques on a denoising task using the AAPM-Mayo Clinical Low-Dose CT Grand Challenge dataset, and 2) demonstrating consistency in nodule detection performance across 186 real-world low-dose CT chest scans acquired at our institution. CTFlow performs better in the denoising task for both peak signal-to-noise ratio and perceptual quality metrics. Moreover, CTFlow produces more consistent predictions across all dose and kernel conditions than generative adversarial network (GAN)-based image harmonization on a lung nodule detection task. The code is available at https://github.com/hsu-lab/ctflow.

14.

Multivariate Sequential Analytics for Cardiovascular Disease Event Prediction.

Hsu, William; Warren, Jim; Riddle, Patricia.

Methods Inf Med ; 61(S 02): e149-e171, 2022 12.

Article in English | MEDLINE | ID: mdl-36564011

ABSTRACT

BACKGROUND: Automated clinical decision support for risk assessment is a powerful tool in combating cardiovascular disease (CVD), enabling targeted early intervention that could avoid issues of overtreatment or undertreatment. However, current CVD risk prediction models use observations at baseline without explicitly representing patient history as a time series. OBJECTIVE: The aim of this study is to examine whether by explicitly modelling the temporal dimension of patient history event prediction may be improved. METHODS: This study investigates methods for multivariate sequential modelling with a particular emphasis on long short-term memory (LSTM) recurrent neural networks. Data from a CVD decision support tool is linked to routinely collected national datasets including pharmaceutical dispensing, hospitalization, laboratory test results, and deaths. The study uses a 2-year observation and a 5-year prediction window. Selected methods are applied to the linked dataset. The experiments performed focus on CVD event prediction. CVD death or hospitalization in a 5-year interval was predicted for patients with history of lipid-lowering therapy. RESULTS: The results of the experiments showed temporal models are valuable for CVD event prediction over a 5-year interval. This is especially the case for LSTM, which produced the best predictive performance among all models compared achieving AUROC of 0.801 and average precision of 0.425. The non-temporal model comparator ridge classifier (RC) trained using all quarterly data or by aggregating quarterly data (averaging time-varying features) was highly competitive achieving AUROC of 0.799 and average precision of 0.420 and AUROC of 0.800 and average precision of 0.421, respectively. CONCLUSION: This study provides evidence that the use of deep temporal models particularly LSTM in clinical decision support for chronic disease would be advantageous with LSTM significantly improving on commonly used regression models such as logistic regression and Cox proportional hazards on the task of CVD event prediction.

Subject(s)

Cardiovascular Diseases , Humans , Cardiovascular Diseases/epidemiology , Risk Factors , Risk Assessment/methods , Neural Networks, Computer , Multivariate Analysis

15.

Automated estimation of ischemic core volume on noncontrast-enhanced CT via machine learning.

Chen, Iris E; Tsui, Brian; Zhang, Haoyue; Qiao, Joe X; Hsu, William; Nour, May; Salamon, Noriko; Ledbetter, Luke; Polson, Jennifer; Arnold, Corey; BahrHossieni, Mersedeh; Jahan, Reza; Duckwiler, Gary; Saver, Jeffrey; Liebeskind, David; Nael, Kambiz.

Interv Neuroradiol ; : 15910199221145487, 2022 Dec 26.

Article in English | MEDLINE | ID: mdl-36572984

ABSTRACT

BACKGROUND: Accurate estimation of ischemic core on baseline imaging has treatment implications in patients with acute ischemic stroke (AIS). Machine learning (ML) algorithms have shown promising results in estimating ischemic core using routine noncontrast computed tomography (NCCT). OBJECTIVE: We used an ML-trained algorithm to quantify ischemic core volume on NCCT in a comparative analysis to pretreatment magnetic resonance imaging (MRI) diffusion-weighted imaging (DWI) in patients with AIS. METHODS: Patients with AIS who had both pretreatment NCCT and MRI were enrolled. An automatic segmentation ML approach was applied using Brainomix software (Oxford, UK) to segment the ischemic voxels and calculate ischemic core volume on NCCT. Ischemic core volume was also calculated on baseline MRI DWI. Comparative analysis was performed using Bland-Altman plots and Pearson correlation. RESULTS: A total of 72 patients were included. The time-to-stroke onset time was 134.2/89.5 minutes (mean/median). The time difference between NCCT and MRI was 64.8/44.5 minutes (mean/median). In patients who presented within 1 hour from stroke onset, the ischemic core volumes were significantly (p = 0.005) underestimated by ML-NCCT. In patients presented beyond 1 hour, the ML-NCCT estimated ischemic core volumes approximated those obtained by MRI-DWI and with significant correlation (r = 0.56, p < 0.001). CONCLUSION: The ischemic core volumes calculated by the described ML approach on NCCT approximate those obtained by MRI in patients with AIS who present beyond 1 hour from stroke onset.

16.

Medication adherence prediction through temporal modelling in cardiovascular disease management.

Hsu, William; Warren, James R; Riddle, Patricia J.

BMC Med Inform Decis Mak ; 22(1): 313, 2022 11 29.

Article in English | MEDLINE | ID: mdl-36447245

ABSTRACT

BACKGROUND: Chronic conditions place a considerable burden on modern healthcare systems. Within New Zealand and worldwide cardiovascular disease (CVD) affects a significant proportion of the population and it is the leading cause of death. Like other chronic diseases, the course of cardiovascular disease is usually prolonged and its management necessarily long-term. Despite being highly effective in reducing CVD risk, non-adherence to long-term medication continues to be a longstanding challenge in healthcare delivery. The study investigates the benefits of integrating patient history and assesses the contribution of explicitly temporal models to medication adherence prediction in the context of lipid-lowering therapy. METHODS: Data from a CVD risk assessment tool is linked to routinely collected national and regional data sets including pharmaceutical dispensing, hospitalisation, lab test results and deaths. The study extracts a sub-cohort from 564,180 patients who had primary CVD risk assessment for analysis. Based on community pharmaceutical dispensing record, proportion of days covered (PDC) [Formula: see text] 80 is used as the threshold for adherence. Two years (8 quarters) of patient history before their CVD risk assessment is used as the observation window to predict patient adherence in the subsequent 5 years (20 quarters). The predictive performance of temporal deep learning models long short-term memory (LSTM) and simple recurrent neural networks (Simple RNN) are compared against non-temporal models multilayer perceptron (MLP), ridge classifier (RC) and logistic regression (LR). Further, the study investigates the effect of lengthening the observation window on the task of adherence prediction. RESULTS: Temporal models that use sequential data outperform non-temporal models, with LSTM producing the best predictive performance achieving a ROC AUC of 0.805. A performance gap is observed between models that can discover non-linear interactions between predictor variables and their linear counter parts, with neural network (NN) based models significantly outperforming linear models. Additionally, the predictive advantage of temporal models become more pronounced when the length of the observation window is increased. CONCLUSION: The findings of the study provide evidence that using deep temporal models to integrate patient history in adherence prediction is advantageous. In particular, the RNN architecture LSTM significantly outperforms all other model comparators.

Subject(s)

Cardiovascular Diseases , Humans , Cardiovascular Diseases/drug therapy , Medication Adherence , Hospitalization , Neural Networks, Computer , Pharmaceutical Preparations

17.

External Validation of an Ensemble Model for Automated Mammography Interpretation by Artificial Intelligence.

Hsu, William; Hippe, Daniel S; Nakhaei, Noor; Wang, Pin-Chieh; Zhu, Bing; Siu, Nathan; Ahsen, Mehmet Eren; Lotter, William; Sorensen, A Gregory; Naeim, Arash; Buist, Diana S M; Schaffter, Thomas; Guinney, Justin; Elmore, Joann G; Lee, Christoph I.

JAMA Netw Open ; 5(11): e2242343, 2022 11 01.

Article in English | MEDLINE | ID: mdl-36409497

ABSTRACT

Importance: With a shortfall in fellowship-trained breast radiologists, mammography screening programs are looking toward artificial intelligence (AI) to increase efficiency and diagnostic accuracy. External validation studies provide an initial assessment of how promising AI algorithms perform in different practice settings. Objective: To externally validate an ensemble deep-learning model using data from a high-volume, distributed screening program of an academic health system with a diverse patient population. Design, Setting, and Participants: In this diagnostic study, an ensemble learning method, which reweights outputs of the 11 highest-performing individual AI models from the Digital Mammography Dialogue on Reverse Engineering Assessment and Methods (DREAM) Mammography Challenge, was used to predict the cancer status of an individual using a standard set of screening mammography images. This study was conducted using retrospective patient data collected between 2010 and 2020 from women aged 40 years and older who underwent a routine breast screening examination and participated in the Athena Breast Health Network at the University of California, Los Angeles (UCLA). Main Outcomes and Measures: Performance of the challenge ensemble method (CEM) and the CEM combined with radiologist assessment (CEM+R) were compared with diagnosed ductal carcinoma in situ and invasive cancers within a year of the screening examination using performance metrics, such as sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC). Results: Evaluated on 37â¯317 examinations from 26â¯817 women (mean [SD] age, 58.4 [11.5] years), individual model AUROC estimates ranged from 0.77 (95% CI, 0.75-0.79) to 0.83 (95% CI, 0.81-0.85). The CEM model achieved an AUROC of 0.85 (95% CI, 0.84-0.87) in the UCLA cohort, lower than the performance achieved in the Kaiser Permanente Washington (AUROC, 0.90) and Karolinska Institute (AUROC, 0.92) cohorts. The CEM+R model achieved a sensitivity (0.813 [95% CI, 0.781-0.843] vs 0.826 [95% CI, 0.795-0.856]; P = .20) and specificity (0.925 [95% CI, 0.916-0.934] vs 0.930 [95% CI, 0.929-0.932]; P = .18) similar to the radiologist performance. The CEM+R model had significantly lower sensitivity (0.596 [95% CI, 0.466-0.717] vs 0.850 [95% CI, 0.766-0.923]; P < .001) and specificity (0.803 [95% CI, 0.734-0.861] vs 0.945 [95% CI, 0.936-0.954]; P < .001) than the radiologist in women with a prior history of breast cancer and Hispanic women (0.894 [95% CI, 0.873-0.910] vs 0.926 [95% CI, 0.919-0.933]; P = .004). Conclusions and Relevance: This study found that the high performance of an ensemble deep-learning model for automated screening mammography interpretation did not generalize to a more diverse screening cohort, suggesting that the model experienced underspecification. This study suggests the need for model transparency and fine-tuning of AI models for specific target populations prior to their clinical adoption.

Subject(s)

Breast Neoplasms , Mammography , Humans , Female , Adult , Middle Aged , Artificial Intelligence , Breast Neoplasms/diagnostic imaging , Retrospective Studies , Early Detection of Cancer

18.

Radiologist Preferences for Artificial Intelligence-Based Decision Support During Screening Mammography Interpretation.

Hendrix, Nathaniel; Lowry, Kathryn P; Elmore, Joann G; Lotter, William; Sorensen, Gregory; Hsu, William; Liao, Geraldine J; Parsian, Sana; Kolb, Suzanne; Naeim, Arash; Lee, Christoph I.

J Am Coll Radiol ; 19(10): 1098-1110, 2022 10.

Article in English | MEDLINE | ID: mdl-35970474

ABSTRACT

BACKGROUND: Artificial intelligence (AI) may improve cancer detection and risk prediction during mammography screening, but radiologists' preferences regarding its characteristics and implementation are unknown. PURPOSE: To quantify how different attributes of AI-based cancer detection and risk prediction tools affect radiologists' intentions to use AI during screening mammography interpretation. MATERIALS AND METHODS: Through qualitative interviews with radiologists, we identified five primary attributes for AI-based breast cancer detection and four for breast cancer risk prediction. We developed a discrete choice experiment based on these attributes and invited 150 US-based radiologists to participate. Each respondent made eight choices for each tool between three alternatives: two hypothetical AI-based tools versus screening without AI. We analyzed samplewide preferences using random parameters logit models and identified subgroups with latent class models. RESULTS: Respondents (n = 66; 44% response rate) were from six diverse practice settings across eight states. Radiologists were more interested in AI for cancer detection when sensitivity and specificity were balanced (94% sensitivity with <25% of examinations marked) and AI markup appeared at the end of the hanging protocol after radiologists complete their independent review. For AI-based risk prediction, radiologists preferred AI models using both mammography images and clinical data. Overall, 46% to 60% intended to adopt any of the AI tools presented in the study; 26% to 33% approached AI enthusiastically but were deterred if the features did not align with their preferences. CONCLUSION: Although most radiologists want to use AI-based decision support, short-term uptake may be maximized by implementing tools that meet the preferences of dissuadable users.

Subject(s)

Breast Neoplasms , Mammography , Artificial Intelligence , Breast Neoplasms/diagnostic imaging , Early Detection of Cancer/methods , Female , Humans , Mammography/methods , Mass Screening , Radiologists

19.

Automated quantitative assessment of amorphous calcifications: Towards improved malignancy risk stratification.

Marathe, Kalyani; Marasinou, Chrysostomos; Li, Beibin; Nakhaei, Noor; Li, Bo; Elmore, Joann G; Shapiro, Linda; Hsu, William.

Comput Biol Med ; 146: 105504, 2022 07.

Article in English | MEDLINE | ID: mdl-35525068

ABSTRACT

BACKGROUND: Amorphous calcifications noted on mammograms (i.e., small and indistinct calcifications that are difficult to characterize) are associated with high diagnostic uncertainty, often leading to biopsies. Yet, only 20% of biopsied amorphous calcifications are cancer. We present a quantitative approach for distinguishing between benign and actionable (high-risk and malignant) amorphous calcifications using a combination of local textures, global spatial relationships, and interpretable handcrafted expert features. METHOD: Our approach was trained and validated on a set of 168 2D full-field digital mammography exams (248 images) from 168 patients. Within these 248 images, we identified 276 image regions with segmented amorphous calcifications and a biopsy-confirmed diagnosis. A set of local (radiomic and region measurements) and global features (distribution and expert-defined) were extracted from each image. Local features were grouped using an unsupervised k-means clustering algorithm. All global features were concatenated with clustered local features and used to train a LightGBM classifier to distinguish benign from actionable cases. RESULTS: On the held-out test set of 60 images, our approach achieved a sensitivity of 100%, specificity of 35%, and a positive predictive value of 38% when the decision threshold was set to 0.4. Given that all of the images in our test set resulted in a recommendation of a biopsy, the use of our algorithm would have identified 15 images (25%) that were benign, potentially reducing the number of breast biopsies. CONCLUSIONS: Quantitative analysis of full-field digital mammograms can extract subtle shape, texture, and distribution features that may help to distinguish between benign and actionable amorphous calcifications.

Subject(s)

Breast Diseases , Breast Neoplasms , Breast/diagnostic imaging , Breast/pathology , Breast Diseases/diagnostic imaging , Breast Neoplasms/diagnostic imaging , Breast Neoplasms/pathology , Female , Humans , Mammography/methods , Radiographic Image Interpretation, Computer-Assisted/methods , Risk Assessment

20.

Olfactory Neuroblastoma: Re-Evaluating the Paradigm of Intracranial Extension and Cyst Formation.

Dumont, Rebecca A; Palma Diaz, Miguel Fernando; Hsu, William; Sepahdari, Ali R.

Diagnostics (Basel) ; 12(3)2022 Mar 01.

Article in English | MEDLINE | ID: mdl-35328167

ABSTRACT

The purpose of the current study was to assess the prevalence of cyst formation at the brain-tumor interface in olfactory neuroblastoma. We used the UCLA patient-based Pathology and Radiology Head and Neck Database (UPP&R HAND) to identify the largest patient cohort reported to date with imaging and pathology data. Eighteen of thirty-one patients (58.1%) had evidence of intracranial extension on MRI, while four (22.0%) demonstrated cyst formation at the brain-tumor interface. The extent of intracranial extension was by far the strongest predictor for intracranial cyst formation, regardless of Hyams tumor grade, using a binary logistics regression model (p = 0.002) and ROC curve analysis (AUC 94.6%). Cyst formation at the brain-tumor interface was an uncommon imaging finding, and tends to occur with a larger component of intracranial tumor extension.

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

Subject(s)

ABSTRACT

SEND TO:

SELECTION OF CITATIONS

SEARCH DETAIL